
    Design and Real-World Application of Novel Machine Learning Techniques for Improving Face Recognition Algorithms

    Recent progress in machine learning has made possible the development of real-world face recognition applications that can match face images as well as or better than humans. However, several challenges remain unsolved. In this PhD thesis, some of these challenges are studied and novel machine learning techniques to improve the performance of real-world face recognition applications are proposed. Current face recognition algorithms based on deep learning techniques are able to achieve outstanding accuracy when dealing with face images taken in unconstrained environments. However, training these algorithms is often costly due to the very large datasets and the high computational resources needed. On the other hand, traditional methods for face recognition are better suited when these requirements cannot be satisfied. This PhD thesis presents new techniques for both traditional and deep learning methods. In particular, a novel traditional face recognition method that combines texture and shape features together with subspace representation techniques is first presented. The proposed method is lightweight and can be trained quickly with small datasets. This method is used for matching face images scanned from identity documents against face images stored in the biometric chip of such documents. Next, two new techniques to increase the performance of face recognition methods based on convolutional neural networks are presented. Specifically, a novel training strategy that increases face recognition accuracy when dealing with face images presenting occlusions, and a new loss function that improves the performance of the triplet loss function are proposed. Finally, the problem of collecting large face datasets is considered, and a novel method based on generative adversarial networks to synthesize both face images of existing subjects in a dataset and face images of new subjects is proposed. The accuracy of existing face recognition algorithms can be increased by training with datasets augmented with the synthetic face images generated by the proposed method. In addition to the main contributions, this thesis provides a comprehensive literature review of face recognition methods and their evolution over the years. A significant amount of the work presented in this PhD thesis is the outcome of a 3-year-long research project partially funded by Innovate UK as part of a Knowledge Transfer Partnership between the University of Hertfordshire and IDscan Biometrics Ltd (partnership number: 009547).

    Shape and Texture Combined Face Recognition for Detection of Forged ID Documents

    This paper proposes a face recognition system that can be used to effectively match a face image scanned from an identity (ID) document against the face image stored in the biometric chip of such a document. The purpose of this specific face recognition algorithm is to aid the automatic detection of forged ID documents where the photograph printed on the document's surface has been altered or replaced. The proposed algorithm uses a novel combination of texture and shape features together with subspace representation techniques. In addition, the robustness of the proposed algorithm when dealing with more general face recognition tasks has been demonstrated on the Good, the Bad & the Ugly (GBU) dataset, one of the most challenging datasets containing frontal faces. The proposed algorithm has been complemented with a novel method that adopts two operating points to enhance the reliability of the algorithm's final verification decision.
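    The abstract does not reproduce the implementation, but the overall pipeline can be sketched. The following is a minimal illustration only: it assumes LBP histograms as the texture features, landmark coordinates as the shape features, PCA as the subspace technique, and cosine similarity as the match score; none of these specific choices, nor the threshold values, are confirmed by the paper. The two-operating-point decision follows the idea described above, with two thresholds splitting the score range into reject, refer, and accept regions.

```python
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.decomposition import PCA

def texture_features(gray_face, n_points=8, radius=1):
    """LBP histogram as a stand-in for the paper's texture descriptor."""
    lbp = local_binary_pattern(gray_face, n_points, radius, method="uniform")
    hist, _ = np.histogram(lbp, bins=n_points + 2, range=(0, n_points + 2),
                           density=True)
    return hist

def combined_features(gray_face, landmarks):
    """Concatenate texture and shape (landmark coordinate) features."""
    shape = np.asarray(landmarks, dtype=float).ravel()
    shape = (shape - shape.mean()) / (shape.std() + 1e-8)  # crude normalisation
    return np.concatenate([texture_features(gray_face), shape])

def fit_subspace(X_train, n_components=64):
    """Subspace representation: fit PCA on a gallery of combined features.
    X_train: (n_samples, n_features) array."""
    return PCA(n_components=n_components).fit(X_train)

def match_score(pca, feat_a, feat_b):
    """Cosine similarity between subspace projections of two face images."""
    a = pca.transform(feat_a[None])[0]
    b = pca.transform(feat_b[None])[0]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def verify(score, t_reject=0.3, t_accept=0.7):
    """Two operating points: thresholds t_reject < t_accept split the score
    range into reject / refer-for-manual-review / accept. The threshold
    values here are placeholders, not the paper's settings."""
    if score >= t_accept:
        return "accept"
    if score < t_reject:
        return "reject"
    return "refer"
```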

    Enhancing Convolutional Neural Networks for Face Recognition with Occlusion Maps and Batch Triplet Loss

    Despite the recent success of convolutional neural networks for computer vision applications, unconstrained face recognition remains a challenge. In this work, we make two contributions to the field. Firstly, we consider the problem of face recognition with partial occlusions and show how current approaches may suffer significant performance degradation when dealing with this kind of face image. We propose a simple method to find out which parts of the human face are more important for achieving a high recognition rate, and use that information during training to force a convolutional neural network to learn discriminative features from all face regions more equally, including those that typical approaches tend to pay less attention to. We test the accuracy of the proposed method when dealing with real-life occlusions using the AR face database. Secondly, we propose a novel loss function called batch triplet loss that improves the performance of the triplet loss by adding an extra term that minimises the standard deviation of both positive and negative scores. We show consistent improvement on the Labeled Faces in the Wild (LFW) benchmark by applying both proposed adjustments to the convolutional neural network training.
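    The extra term described above, a penalty on the spread of the positive and negative scores within a batch, can be sketched in a few lines. This is a minimal PyTorch illustration of the idea, not the paper's exact formulation; the margin, the weight `lam`, and the use of squared Euclidean distances on L2-normalised embeddings are assumptions.

```python
import torch
import torch.nn.functional as F

def batch_triplet_loss(anchor, positive, negative, margin=0.2, lam=0.5):
    """Standard triplet loss plus a term penalising the standard deviation
    of positive and negative distances across the batch, as described in
    the abstract. `margin` and `lam` are illustrative values only."""
    anchor = F.normalize(anchor, dim=1)
    positive = F.normalize(positive, dim=1)
    negative = F.normalize(negative, dim=1)
    d_pos = (anchor - positive).pow(2).sum(dim=1)  # positive scores
    d_neg = (anchor - negative).pow(2).sum(dim=1)  # negative scores

    triplet = F.relu(d_pos - d_neg + margin).mean()
    spread = d_pos.std() + d_neg.std()  # extra term: minimise score spread
    return triplet + lam * spread
```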

    Generating photo-realistic training data to improve face recognition accuracy

    Face recognition has become a widely adopted biometric in forensics, security and law enforcement thanks to the high accuracy achieved by systems based on convolutional neural networks (CNNs). However, to achieve good performance, CNNs need to be trained with very large datasets which are not always available. In this paper, we investigate the feasibility of using synthetic data to augment face datasets. In particular, we propose a novel generative adversarial network (GAN) that can disentangle identity-related attributes from non-identity-related attributes. This is done by training an embedding network that maps discrete identity labels to an identity latent space that follows a simple prior distribution, and training a GAN conditioned on samples from that distribution. A main novelty of our approach is the ability to generate both synthetic images of subjects in the training set and synthetic images of new subjects not in the training set, both of which we use to augment face datasets. By using recent advances in GAN training, we show that the synthetic images generated by our model are photo-realistic, and that training with datasets augmented with those images can lead to increased recognition accuracy. Experimental results show that our method is more effective when augmenting small datasets. In particular, an absolute accuracy improvement of 8.42% was achieved when augmenting a dataset of fewer than 60k facial images.
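    The identity-conditioning mechanism described above can be sketched as follows. This is an illustrative PyTorch skeleton, not the paper's architecture: the network sizes, the prior (a standard Gaussian here), and the way the identity code is concatenated with the noise vector are all assumptions. What it shows is the key consequence of the design: once the embedding maps training identities into a latent space matching a simple prior, new identities can be synthesized by sampling that prior directly.

```python
import torch
import torch.nn as nn

ID_DIM, NOISE_DIM, N_IDS = 128, 100, 10000  # illustrative sizes

class IdentityEmbedding(nn.Module):
    """Maps discrete identity labels to an identity latent space that the
    training procedure (not shown) encourages to follow a simple prior."""
    def __init__(self, n_ids=N_IDS, id_dim=ID_DIM):
        super().__init__()
        self.table = nn.Embedding(n_ids, id_dim)

    def forward(self, labels):
        return self.table(labels)

class Generator(nn.Module):
    """Toy generator conditioned on an identity code plus a noise vector
    carrying the non-identity attributes (pose, lighting, expression)."""
    def __init__(self, id_dim=ID_DIM, noise_dim=NOISE_DIM):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(id_dim + noise_dim, 512), nn.ReLU(),
            nn.Linear(512, 64 * 64 * 3), nn.Tanh(),
        )

    def forward(self, id_code, noise):
        x = torch.cat([id_code, noise], dim=1)
        return self.net(x).view(-1, 3, 64, 64)

embed, gen = IdentityEmbedding(), Generator()

# Existing subject: look up its learned identity code.
img_known = gen(embed(torch.tensor([42])), torch.randn(1, NOISE_DIM))

# New subject: sample an identity code directly from the prior
# (assumed standard Gaussian here).
img_new = gen(torch.randn(1, ID_DIM), torch.randn(1, NOISE_DIM))
```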

    Sign Language Motion Generation from Sign Characteristics

    This paper proposes, analyzes, and evaluates a deep learning architecture based on transformers for generating sign language motion from sign phonemes (represented using HamNoSys, a notation system developed at the University of Hamburg). The sign phonemes provide information about sign characteristics such as hand configuration, localization, or movements. The use of sign phonemes is crucial for generating sign motion with a high level of detail (including finger extensions and flexions). The transformer-based approach also includes a stop detection module for predicting the end of the generation process. Both aspects, motion generation and stop detection, are evaluated in detail. For motion generation, the dynamic time warping (DTW) distance is used to compute the similarity between two landmark sequences (ground truth and generated). The stop detection module is evaluated considering detection accuracy and ROC (receiver operating characteristic) curves. The paper proposes and evaluates several strategies to obtain the system configuration with the best performance, including different padding schemes, interpolation approaches, and data augmentation techniques. The best configuration of a fully automatic system obtains an average DTW distance per frame of 0.1057 and an area under the ROC curve (AUC) higher than 0.94.
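    The evaluation metric named above, DTW distance per frame between a ground-truth and a generated landmark sequence, follows a classic dynamic-programming recurrence. Below is a minimal sketch; the Euclidean frame-to-frame cost and the normalisation by the reference length are assumptions, as the paper's exact cost function is not reproduced here.

```python
import numpy as np

def dtw_distance_per_frame(ref, gen):
    """Dynamic time warping distance between two landmark sequences,
    normalised by the reference length. `ref` and `gen` have shape
    (n_frames, n_landmarks * n_coords); the two sequences may differ
    in length, which is what DTW alignment accounts for."""
    n, m = len(ref), len(gen)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(ref[i - 1] - gen[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m] / n  # distance per frame

# Example with random sequences of different lengths.
rng = np.random.default_rng(0)
print(dtw_distance_per_frame(rng.normal(size=(50, 20)),
                             rng.normal(size=(60, 20))))
```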

    Sign Language Dataset for Automatic Motion Generation

    Several sign language datasets are available in the literature. Most of them are designed for sign language recognition and translation. This paper presents a new sign language dataset for automatic motion generation. This dataset includes phonemes for each sign (specified in HamNoSys, a transcription system developed at the University of Hamburg, Germany) and the corresponding motion information. The motion information includes sign videos and the sequence of extracted landmarks associated with relevant points of the skeleton (including face, arms, hands, and fingers). The dataset includes signs from three different subjects in three different positions, performing 754 signs including the entire alphabet, numbers from 0 to 100, numbers for hour specification, months, and weekdays, and the most frequent signs used in Spanish Sign Language (LSE). In total, there are 6786 videos and their corresponding phonemes (HamNoSys annotations). From each video, a sequence of landmarks was extracted using MediaPipe. The dataset allows training an automatic system for motion generation from sign language phonemes. This paper also presents preliminary results in motion generation from sign phonemes, obtaining a Dynamic Time Warping distance per frame of 0.37.
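    The landmark extraction step mentioned above can be reproduced with MediaPipe's Holistic solution, which returns pose, face, and hand landmarks per frame. The sketch below is illustrative only: the dataset's actual extraction settings, coordinate normalisation, and handling of missed detections (zero-filled here) are not specified in the abstract and are assumptions.

```python
import cv2
import mediapipe as mp
import numpy as np

def extract_landmark_sequence(video_path):
    """Run MediaPipe Holistic on each frame of a sign video and collect
    the pose (33), face (468), and per-hand (21) landmarks as one flat
    vector per frame. Missing detections are filled with zeros."""
    holistic = mp.solutions.holistic.Holistic(static_image_mode=False)
    cap = cv2.VideoCapture(video_path)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        parts = []
        for lms, n in [(results.pose_landmarks, 33),
                       (results.face_landmarks, 468),
                       (results.left_hand_landmarks, 21),
                       (results.right_hand_landmarks, 21)]:
            if lms is None:
                parts.append(np.zeros(n * 3))
            else:
                parts.append(np.array([[p.x, p.y, p.z]
                                       for p in lms.landmark]).ravel())
        frames.append(np.concatenate(parts))
    cap.release()
    holistic.close()
    return np.stack(frames)  # shape: (n_frames, total_coords)
```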

    Mural Endocarditis: The GAMES Registry Series and Review of the Literature
